Search Result

Federated learning algorithm based on personalized differential privacy

Chunyong YIN, Rui QU

Journal of Computer Applications 2023, 43 (4): 1160-1168. DOI: 10.11772/j.issn.1001-9081.2022030337

Federated Learning (FL) can effectively protect users' personal data from attackers. Differential Privacy (DP) is applied to enhance the privacy of FL and to address the privacy disclosure caused by the parameters exchanged during model training. However, existing DP-based FL methods concentrate on a unified privacy budget and ignore users' personalized privacy requirements. To solve this problem, a two-stage Federated Learning with Personalized Differential Privacy (PDP-FL) algorithm was proposed. In the first stage, each user's privacy was graded according to the user's privacy preference, and noise matching that preference was added to achieve personalized privacy protection. At the same time, the privacy level corresponding to the privacy preference was uploaded to the central aggregation server. In the second stage, to fully protect the global data, a strategy of simultaneous local and central protection was adopted: according to the privacy levels uploaded by the users, noise conforming to the global DP threshold was added to quantify the global privacy protection level. Experimental results show that on the MNIST and CIFAR-10 datasets, the classification accuracy of the PDP-FL algorithm reaches 93.8% to 94.5% and 43.4% to 45.2% respectively, which is better than those of the Federated learning with Local Differential Privacy (LDP-Fed) algorithm and the Federated Learning with Global Differential Privacy (GDP-FL) algorithm, and that the PDP-FL algorithm meets the needs of personalized privacy protection.
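
The first stage described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the noise mechanism, so the Laplace mechanism, the L1 clipping step, the level-to-epsilon mapping `LEVEL_TO_EPSILON`, and all function names are assumptions made for illustration. The second-stage central noise is omitted because the abstract gives no detail on how it is calibrated.

```python
import random

# Hypothetical mapping from a user's privacy level to a budget epsilon
# (smaller epsilon = stronger privacy). These values are not from the paper.
LEVEL_TO_EPSILON = {"high": 0.5, "medium": 1.0, "low": 4.0}


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def clip_l1(update, bound):
    # Rescale the update so that its L1 norm is at most `bound`.
    norm = sum(abs(x) for x in update)
    factor = min(1.0, bound / norm) if norm > 0 else 1.0
    return [x * factor for x in update]


def privatize_update(update, level, clip_bound, rng):
    # Assumed first stage: noise is calibrated to the user's own privacy level.
    epsilon = LEVEL_TO_EPSILON[level]
    clipped = clip_l1(update, clip_bound)
    # Two clipped updates differ by at most 2 * clip_bound in L1 norm.
    scale = 2.0 * clip_bound / epsilon
    return [x + laplace_noise(scale, rng) for x in clipped]


def aggregate(noisy_updates):
    # Plain coordinate-wise averaging on the server.
    n = len(noisy_updates)
    return [sum(column) / n for column in zip(*noisy_updates)]


rng = random.Random(0)
users = [([0.3, -0.2], "high"), ([0.1, 0.4], "low")]
noisy = [privatize_update(u, level, 1.0, rng) for u, level in users]
global_update = aggregate(noisy)
```

A user with level "high" receives noise eight times larger in scale than a user with level "low" under this hypothetical mapping, which is the sense in which the protection is personalized.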

Unsupervised time series anomaly detection model based on re-encoding

Chunyong YIN, Liwen ZHOU

Journal of Computer Applications 2023, 43 (3): 804-811. DOI: 10.11772/j.issn.1001-9081.2022010006

To deal with the low accuracy of anomaly detection caused by data imbalance and the highly complex temporal correlation of time series, a re-encoding based unsupervised time series anomaly detection model built on a Generative Adversarial Network (GAN), named RTGAN (Re-encoding Time series based on GAN), was proposed. Firstly, multiple generators with cycle consistency were used to ensure the diversity of generated samples and thereby learn different anomaly patterns. Secondly, a stacked Long Short-Term Memory-dropout Recurrent Neural Network (LSTM-dropout RNN) was used to capture temporal correlation. Thirdly, the differences between the generated samples and the real samples were compared in the latent space by improved re-encoding. These differences, the re-encoding errors, served as part of the anomaly score to improve the accuracy of anomaly detection. Finally, the new anomaly score was used to detect anomalies on univariate and multivariate time series datasets. The proposed model was compared with seven baseline anomaly detection models on univariate and multivariate time series. Experimental results show that the proposed model obtains the highest average F1-score (0.815) over all datasets, and that its overall performance is 36.29% and 8.52% higher respectively than those of the original AutoEncoder (AE) model Dense-AE (Dense-AutoEncoder) and the latest benchmark model USAD (UnSupervised Anomaly Detection on multivariate time series). The robustness of the model was tested under different Signal-to-Noise Ratios (SNRs). The results show that the proposed model consistently outperforms LSTM-VAE (Variational Autoencoder based on LSTM), USAD and OmniAnomaly; in particular, at 30% SNR, the F1-score of RTGAN is 13.53% and 10.97% higher respectively than those of USAD and OmniAnomaly. It can be seen that RTGAN can effectively improve the accuracy and robustness of anomaly detection.
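
The idea of using a re-encoding error as part of the anomaly score can be sketched as follows. This is an illustrative toy, not RTGAN itself: the mixing weight `weight`, the Euclidean distances, and the stand-in functions `toy_encode` and `toy_generate` are assumptions, since the abstract does not give the exact scoring formula or network details.

```python
import math


def l2(a, b):
    # Euclidean distance between two equal-length vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def anomaly_score(x, encode, generate, weight=0.5):
    # Encode the real window, generate a sample from its latent code,
    # then re-encode the generated sample and compare in latent space.
    z = encode(x)
    x_hat = generate(z)
    z_hat = encode(x_hat)
    reconstruction_error = l2(x, x_hat)
    reencoding_error = l2(z, z_hat)
    # Assumed combination: a convex mix of the two errors.
    return weight * reconstruction_error + (1.0 - weight) * reencoding_error


def toy_encode(window):
    # Hypothetical encoder: the latent code is the window mean.
    return [sum(window) / len(window)]


def toy_generate(z, length=4):
    # Hypothetical generator that only reproduces the "normal" range [0, 2].
    level = min(max(z[0], 0.0), 2.0)
    return [level] * length


normal_score = anomaly_score([1.0, 1.0, 1.0, 1.0], toy_encode, toy_generate)
spike_score = anomaly_score([1.0, 1.0, 1.0, 9.0], toy_encode, toy_generate)
```

In this toy setting the window containing a spike gets a strictly larger score than the flat window, because both its reconstruction and its latent code drift away from what the generator can reproduce.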

Fast sanitization algorithm based on BCU-Tree and dictionary for high-utility mining

Chunyong YIN, Ying LI

Journal of Computer Applications 2023, 43 (2): 413-422. DOI: 10.11772/j.issn.1001-9081.2021122161

Privacy Preserving Utility Mining (PPUM) suffers from long sanitization time, high computational complexity, and large side effects. To solve these problems, a fast sanitization algorithm based on BCU-Tree and Dictionary (BCUTD) for high-utility mining was proposed. In the algorithm, a new tree structure called BCU-Tree was presented to store sensitive item information, and a coding model based on bitwise operators was used to reduce the tree construction time and the search space. A dictionary table was used to store all nodes in the tree structure, so that only the dictionary table needed to be accessed when a sensitive item was modified. Finally, the sanitization process was completed. In experiments on four different datasets, the BCUTD algorithm performs better in terms of sanitization time and side effects than Hiding High Utility Item First (HHUIF), Maximum Sensitive Utility-MAximum item Utility (MSU-MAU), and Fast Perturbation Using Tree and Table structures (FPUTT). Experimental results show that the BCUTD algorithm can effectively speed up the sanitization process and reduce both the side effects and the computational complexity of the algorithm.
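
The general shape of sanitization for high-utility mining can be sketched as below. This is deliberately not the BCU-Tree: the abstract does not describe the tree layout, so the sketch only borrows two ideas it mentions, a bitwise containment test and a dictionary that is the only structure touched while sensitive items are modified. The item-reduction rule, all function names, and the assumption of positive integer quantities and profits are illustrative.

```python
def item_bits(items):
    # Assign one bit per item so that itemsets become integer bitmasks.
    return {item: 1 << pos for pos, item in enumerate(sorted(items))}


def mask_of(items, bits):
    mask = 0
    for item in items:
        mask |= bits[item]
    return mask


def utility(itemset, db, profit, bits):
    # Sum quantity * unit profit over transactions containing the itemset.
    target = mask_of(itemset, bits)
    return sum(
        tx[i] * profit[i]
        for tx in db
        if mask_of(tx, bits) & target == target
        for i in itemset
    )


def sanitize(db, profit, sensitive_itemsets, min_util):
    # Modifies `db` in place until each sensitive itemset falls below min_util.
    bits = item_bits(profit)
    for itemset in sensitive_itemsets:
        target = mask_of(itemset, bits)
        # Dictionary of the transactions that contain the whole itemset.
        index = {tid: tx for tid, tx in enumerate(db)
                 if mask_of(tx, bits) & target == target}
        excess = sum(tx[i] * profit[i]
                     for tx in index.values() for i in itemset) - min_util + 1
        for tx in index.values():
            if excess <= 0:
                break
            # Reduce the item contributing the most utility in this transaction.
            item = max(itemset, key=lambda i: tx[i] * profit[i])
            needed = -(-excess // profit[item])  # ceiling division
            if needed < tx[item]:
                tx[item] -= needed
                excess -= needed * profit[item]
            else:
                # Removing the item drops the itemset from this transaction.
                excess -= sum(tx[i] * profit[i] for i in itemset)
                del tx[item]
    return db


profit = {"a": 3, "b": 10, "c": 1}
db = [{"a": 2, "b": 1}, {"a": 5, "b": 2, "c": 1}, {"c": 3}]
bits = item_bits(profit)
before = utility({"a", "b"}, db, profit, bits)
sanitize(db, profit, [{"a", "b"}], min_util=40)
after = utility({"a", "b"}, db, profit, bits)
```

Here the sensitive itemset {a, b} starts above the threshold of 40 and ends below it, and the transaction that does not contain the itemset is never touched, which is the kind of side effect a sanitization algorithm tries to keep small.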

Unsupervised log anomaly detection model based on CNN and Bi-LSTM

Chunyong YIN, Yangchun ZHANG

Journal of Computer Applications 2023, 43 (11): 3510-3516. DOI: 10.11772/j.issn.1001-9081.2022111738

Logs record the specific status of a system during operation, and automated log anomaly detection is critical to network security. To address the low accuracy of anomaly detection caused by the evolution of log sentences over time, an unsupervised log anomaly detection model named LogCL was proposed. Firstly, a log parsing technique was used to convert semi-structured log data into structured log templates. Secondly, sessions and fixed windows were employed to divide log events into log sequences. Thirdly, quantitative characteristics of the log sequences were extracted, a natural language processing technique was used to extract semantic features of the log templates, and the Term Frequency-Inverse Word Frequency (TF-IWF) algorithm was utilized to generate weighted sentence embedding vectors. Finally, the feature vectors were input into a parallel model based on a Convolutional Neural Network (CNN) and a Bi-directional Long Short-Term Memory (Bi-LSTM) network for detection. Experimental results on two public real datasets show that the proposed model improves the anomaly detection F1-score by 3.6 and 2.3 percentage points respectively compared with the baseline model LogAnomaly. Therefore, LogCL can perform log anomaly detection effectively.
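
The TF-IWF weighting step can be sketched as follows. The abstract does not give the formula, so this uses one common definition, in which the inverse word frequency of a word is the logarithm of the total word count of the corpus divided by that word's count; the normalization of the weighted sum, the toy word vectors, and all names are assumptions for illustration.

```python
import math
from collections import Counter


def tf_iwf_weights(templates):
    # templates: list of non-empty tokenized log templates (lists of words).
    corpus_counts = Counter(word for tpl in templates for word in tpl)
    total_words = sum(corpus_counts.values())
    weights = []
    for tpl in templates:
        term_counts = Counter(tpl)
        weights.append({
            # TF within the template times IWF over the whole corpus.
            word: (count / len(tpl)) * math.log(total_words / corpus_counts[word])
            for word, count in term_counts.items()
        })
    return weights


def sentence_embedding(template_weights, word_vectors):
    # Weighted average of word vectors, using TF-IWF values as weights.
    dim = len(next(iter(word_vectors.values())))
    total_weight = sum(template_weights.values())
    vector = [0.0] * dim
    for word, weight in template_weights.items():
        for k in range(dim):
            vector[k] += weight * word_vectors[word][k]
    if total_weight > 0:
        vector = [v / total_weight for v in vector]
    return vector


templates = [["connection", "closed"], ["connection", "opened"], ["disk", "failure"]]
# Hypothetical two-dimensional word vectors, only for the first template.
word_vectors = {"connection": [1.0, 0.0], "closed": [0.0, 1.0]}
weights = tf_iwf_weights(templates)
embedding = sentence_embedding(weights[0], word_vectors)
```

In the first template the rarer word "closed" gets a larger weight than "connection", so the sentence vector leans toward the word that distinguishes this template from the others.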
